1 Terms of re-use

1.1 License

CC-BY-SA unless otherwise noted.

1.2 Citation

2 Purpose

To extract and visualise tweets and re-tweets of #birdoftheyear OR #boty in September/October 2018.

The code borrows extensively from https://github.com/mkearney/rtweet

The analysis used rtweet to ask the Twitter search API to extract ‘all’ tweets containing the #birdoftheyear OR #boty hashtags in the ‘recent’ twitterVerse.

It is therefore possible that not quite all tweets have been extracted, although it seems likely that we have captured most recent human tweeting, which was the main intention. Future work should instead use the Twitter streaming API.

3 Load Data

Load the data (pre-collected using https://github.com/dataknut/hashTagR/blob/master/dataProcessing/getBoTY.R).

The data has:

4 Analysis

4.1 Tweets and Tweeters over time

Figure 4.1: Number of tweets and tweeters

Figure 4.1 shows the number of tweets and tweeters in the data extract by day. The quotes, tweets and re-tweets have been separated. Looks to me like there’s quite a lot of activity at weekends…
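The daily counts behind a plot like Figure 4.1 could be computed along the following lines. This is a minimal base R sketch, not the code used for this report: the toy data frame and its column names (`created_at`, `screen_name`, `is_retweet`, `is_quote`, modelled on rtweet output) are assumptions for illustration only.

```r
# Toy stand-in for the tweet extract; values and column names are assumptions
tweets <- data.frame(
  created_at = as.POSIXct(
    c("2018-09-29 08:15:00", "2018-09-29 21:40:00",
      "2018-09-30 02:05:00", "2018-09-30 11:30:00",
      "2018-09-30 12:00:00", "2018-09-30 23:55:00"),
    tz = "UTC"
  ),
  screen_name = c("alice", "bob", "alice", "carol", "alice", "alice"),
  is_retweet = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE),
  is_quote = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Classify each row as a quote, a re-tweet or an original tweet
tweets$type <- ifelse(tweets$is_quote, "quote",
                      ifelse(tweets$is_retweet, "retweet", "tweet"))

# Calendar day in UTC, matching the time zone used by the plots
tweets$day <- as.Date(tweets$created_at, tz = "UTC")

groups <- list(day = tweets$day, type = tweets$type)

# Number of tweets per day and type
n_tweets <- aggregate(tweets["screen_name"], by = groups, FUN = length)
names(n_tweets)[3] <- "nTweets"

# Number of distinct tweeters per day and type
n_tweeters <- aggregate(tweets["screen_name"], by = groups,
                        FUN = function(x) length(unique(x)))
names(n_tweeters)[3] <- "nTweeters"

daily <- merge(n_tweets, n_tweeters, by = c("day", "type"))
print(daily)
```

Keeping the tweet count and the tweeter count in separate columns makes it easy to spot days where a few accounts produce most of the activity.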

If you are in New Zealand and wondering why there are no tweets today (2018-10-01), the answer is that Twitter data (and these plots) work in UTC, and (y)our today hasn’t started yet in UTC. But don’t worry, all the tweets are here. It’s just our old friend the timezone… :-)
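The timezone effect is easy to reproduce in base R. The sketch below uses a made-up time stamp (an assumption, not a value from the data) and shows the same instant falling on different calendar days in UTC and in New Zealand.

```r
# A made-up tweet time stamp, stored in UTC as the Twitter data is
tweet_time_utc <- as.POSIXct("2018-09-30 20:00:00", tz = "UTC")

# The calendar day in UTC, which is what the plots use
day_utc <- as.Date(tweet_time_utc, tz = "UTC")

# The calendar day as experienced by a reader in New Zealand
day_nz <- as.Date(tweet_time_utc, tz = "Pacific/Auckland")

print(day_utc)
print(day_nz)
```

Passing `tz = "Pacific/Auckland"` when deriving the day would be one way to re-draw the plots in local time, if that were preferred.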

4.2 Who’s tweeting?

Next we’ll try by screen name.

Figure 4.2: N tweets per day by screen name

Figure 4.2 is a really bad visualisation of all tweeters tweeting over time. Each row of pixels is a tweeter (the names are probably illegible) and a green dot indicates a few tweets in the given day while a red dot indicates a lot of tweets. We’ve used plotly::ggplotly() so you can hover over the data points but it’s still pretty messy.

So let’s re-do that for the top 50 tweeters so we can see their tweetStreaks (tm)…

Top tweeters:

Table 4.1: Top 15 tweeters (all days)
screen_name nTweets
birdoftheyear 745
Forest_and_Bird 224
testeeves 210
vote4kaki 185
NatForsdick 178
coolbiRdpics 174
mifflangstone 170
newzealandbirds 117
jackcraw57 111
thebushline 101
lpneumophila 81
sgalla32 80
This_NZ_Life 80
hugobrown 78
zaichishka 72
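A top-N table of this kind could be built roughly as follows. This is a hedged base R sketch using hypothetical screen names, not the code that produced Table 4.1.

```r
# Hypothetical screen names, one entry per tweet
screen_names <- c("kea", "tui", "kea", "ruru", "kea", "tui")

# Count tweets per screen name and sort from most to least active
counts <- sort(table(screen_names), decreasing = TRUE)

# Keep the top N (here N = 2; the table above uses 15)
top <- head(counts, 2)

top_tweeters <- data.frame(
  screen_name = names(top),
  nTweets = as.integer(top),
  stringsAsFactors = FALSE
)
print(top_tweeters)
```

The resulting `screen_name` vector can then be used to filter the per-day counts before plotting.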

Their tweetStreaks are shown in Figure 4.3.

Figure 4.3: N tweets per day by screen name (top 50, reverse alphabetical)

Any twitterBots…?

4.3 Which birds are mentioned the most (by hashtag)

This is very quick and dirty but… Figure 4.4 plots the number of tweets for each concatenated hashtag string (non-separated hashtags) after removing tweets which have variants of #birdOfTheYear and #boty as the only hashtags and selecting the top 50. Tweets mentioning birds without using a #<birdName> hashtag will not show up; birds #mentioned in tweets with a lot of varying other #hashtags will be under-represented etc etc. This is a really imperfect measure of just about anything so #YMMV.
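One way to build and filter such concatenated hashtag strings is sketched below. It assumes each tweet's hashtags are available as a character vector (as in an rtweet list column) and that the campaign hashtags are stripped before pasting; both are assumptions about the approach, and the example tweets are hypothetical.

```r
# Hypothetical hashtags per tweet, as a list of character vectors
hashtags <- list(
  c("birdoftheyear", "takayay"),
  c("BirdOfTheYear"),
  c("boty", "BOTY"),
  c("boty", "TeamKaki", "kaki"),
  c("takayay", "BirdOfTheYear")
)

# The campaign hashtags, in lower case for case-insensitive matching
campaign_tags <- c("birdoftheyear", "boty")

# Drop the campaign hashtags, then paste what is left into one string
to_string <- function(tags) {
  keep <- tags[!(tolower(tags) %in% campaign_tags)]
  paste(keep, collapse = "|")
}
hashtag_strings <- vapply(hashtags, to_string, character(1))

# Tweets with only campaign hashtags now have an empty string; remove them
hashtag_strings <- hashtag_strings[hashtag_strings != ""]

# Count tweets per string and keep the 50 most frequent
counts <- sort(table(hashtag_strings), decreasing = TRUE)
top_strings <- head(counts, 50)
print(top_strings)
```

Because the strings are compared as-is, the same hashtags in a different order or case count as different strings, which is one source of the under-representation described above.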

Figure 4.4: Top 50 hashtag strings

Table 4.2 shows a slightly more intelligible table of the same data summarised across all days to date. Note that this is not a true count of the mentions of a particular #hashtag: we would need to separate the hashtags and then count unique hashtags for that. Work in progress…

Table 4.2: Number of tweets per hashtag string (sorted by nTweets)
hashtags nTweets
takayay 1299
gannet 132
Pukeko|Kiwi|Australia 42
Vote4Kakī|kakī 39
MenInBlack|BIB 36
takahey|votetakahē 31
TeamKakī|kakī|genetic|genomic 30
birds|birdwatching 28
kereru|kakī 26
voterowi 25
StanyAgaty|KłamaćJakMorawiecki|PiS|DośćKłamstw|Podpaski 22
littlepenguin 22
TeamKakī 19
DammitGannet 18
TeamKaki 18
kotāre|kingfisher 18
TeamBlackPetrel 17
doadivebombbro|dammitgannet 16
VoteRuru 15
dammitgannet|doadivebombbro 13
DopestBirds 12
seabirds|albatross 12
BrotherOfTheYear|TokyoInternationalFilmFestival2018|BrotherOfTheYearPH 10
kakī 10
votetomtit|itsthetits 10
evidence|conservation|kakī 7
votetakahē|makebirdsroundagain 7
Kakī 6
KŌTARE 6
TeamKakī|dopest 6
BIB 5
Penguin 5
StanyAgaty|KłamaćJakMorawiecki|PiS|DośćKłamstw 5
Vote4Kakī 5
makebirdsroundagain|manyshapesareround|livelaughtakahē|takayay|votetakahē 5
DidYouKnow|Travel|adventure 4
fallcolours|woodpecker|SundayFunday 4
kakī|BraidedRiver 4
BrotherOfTheYear|BrotherOfTheYearPH 3
TeamTawaki 3
teamrockhopper 3
KotaPrezesa 2
Kaczyński 1
animatethelivings|canlılarıcanlamdıralım|muslumtekin_ogr|petsofinstagram|pets|petstagram|petslovers|bird 1
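The separation step mentioned above Table 4.2 could look something like the following base R sketch. It takes four rows copied from Table 4.2, splits each string on `|`, lower-cases the parts and sums the tweet counts per hashtag. The totals cover only those four rows, so they are an illustration of the method and not a result.

```r
# Four rows copied from Table 4.2
tab <- data.frame(
  hashtags = c("takayay", "DammitGannet",
               "doadivebombbro|dammitgannet",
               "dammitgannet|doadivebombbro"),
  nTweets = c(1299, 18, 16, 13),
  stringsAsFactors = FALSE
)

# Split on the literal "|" and lower-case, keeping each hashtag once per string
parts <- lapply(strsplit(tab$hashtags, "|", fixed = TRUE),
                function(p) unique(tolower(p)))

# One row per (string, hashtag) pair, carrying the string's tweet count
long <- data.frame(
  hashtag = unlist(parts),
  nTweets = rep(tab$nTweets, lengths(parts)),
  stringsAsFactors = FALSE
)

# Sum tweet counts per hashtag and sort from most to least mentioned
per_hashtag <- aggregate(nTweets ~ hashtag, data = long, FUN = sum)
per_hashtag <- per_hashtag[order(-per_hashtag$nTweets), ]
print(per_hashtag)
```

Lower-casing merges variants such as DammitGannet and dammitgannet, but spellings with and without macrons (kakī, kaki) would still be counted separately unless they are normalised as well.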

5 About

Analysis completed in 11.592 seconds (0.19 minutes) using knitr in RStudio with R version 3.5.1 (2018-07-02) running on x86_64-apple-darwin15.6.0.

A special mention must go to https://github.com/mkearney/rtweet (Kearney 2018) for the Twitter API interaction functions.

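Other R packages used: base R (R Core Team 2016), bookdown (Xie 2016a), data.table (Dowle et al. 2015), ggplot2 (Wickham 2009), knitr (Xie 2016b), plotly (Sievert et al. 2016) and readr (Wickham, Hester, and Francois 2016).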

References

Dowle, M, A Srinivasan, T Short, S Lianoglou with contributions from R Saporta, and E Antonyan. 2015. Data.table: Extension of Data.frame. https://CRAN.R-project.org/package=data.table.

Kearney, Michael W. 2018. Rtweet: Collecting Twitter Data. https://cran.r-project.org/package=rtweet.

R Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2016. Plotly: Create Interactive Web Graphics via ’Plotly.js’. https://CRAN.R-project.org/package=plotly.

Wickham, Hadley. 2009. Ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. http://ggplot2.org.

Wickham, Hadley, Jim Hester, and Romain Francois. 2016. Readr: Read Tabular Data. https://CRAN.R-project.org/package=readr.

Xie, Yihui. 2016a. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman; Hall/CRC. https://github.com/rstudio/bookdown.

———. 2016b. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.